Monte-Carlo Tree Search for the Game of Scotland Yard

Authors

  • J. (Pim) A. M. Nijssen
  • Mark H.M. Winands
Abstract

This abstract describes how Monte-Carlo Tree Search (MCTS) [1, 4] can be applied to the hide-and-seek game Scotland Yard. The game is played by 6 players: 5 seekers and 1 hider. The seekers work together to capture the hider by moving one of their pawns to the location occupied by the hider. The game is played on a map of 199 locations, connected by 4 different transportation types. The hider's location is announced every 5 turns. The seekers always know which transportation type the hider uses.

The basic MCTS algorithm is designed for two-player games with perfect information. To use MCTS in a hide-and-seek game, which is a game with imperfect information, the algorithm has to be altered slightly. In Scotland Yard, the seekers can guess the location of the hider at each iteration of the algorithm and place him on any of the empty locations on the board. This set of locations can be narrowed by removing the locations where the hider cannot be, based on the previous list of possible locations, the current locations of the seekers, and the type of transportation used by the hider. The list of possible locations is updated after every move.

Some of the possible locations are more probable than others. The performance of the seekers could be improved by biasing the selection among the possible locations of the hider. This is done by categorizing the possible locations. The categories are numbered from 1 to L, where L is the number of categories. This technique is called Location Categorization. The type of categorization is game-dependent. For Scotland Yard, we use a categorization based on the distance from the possible location to the nearest seeker. After the hider performs a move, the possible locations are divided into the different categories. There are two ways to store the information about the possible categories and the category of the location of the hider.
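The pruning of possible hider locations and the distance-based categorization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy map `GRAPH`, the function names, and the rule that clamps the distance into the range 1 to L are all hypothetical, since the abstract does not give the exact category boundaries.

```python
from collections import deque

# Hypothetical toy map: location -> {transport type -> neighbouring locations}.
GRAPH = {
    1: {"taxi": [2, 3]},
    2: {"taxi": [1, 4], "bus": [5]},
    3: {"taxi": [1, 4]},
    4: {"taxi": [2, 3, 5]},
    5: {"taxi": [4], "bus": [2]},
}


def update_possible_locations(possible, graph, ticket, seekers):
    """Return the locations the hider may occupy after one move with `ticket`.

    A location survives if it is reachable from some previously possible
    location via the announced transport type and no seeker stands on it.
    """
    occupied = set(seekers)
    return {
        neighbor
        for loc in possible
        for neighbor in graph.get(loc, {}).get(ticket, ())
        if neighbor not in occupied
    }


def distances_to_nearest_seeker(graph, seekers):
    """Multi-source breadth-first search over all transport types."""
    dist = {s: 0 for s in seekers}
    queue = deque(seekers)
    while queue:
        current = queue.popleft()
        for neighbors in graph.get(current, {}).values():
            for neighbor in neighbors:
                if neighbor not in dist:
                    dist[neighbor] = dist[current] + 1
                    queue.append(neighbor)
    return dist


def categorize(possible, graph, seekers, num_categories):
    """Map each possible location to a category in 1..num_categories.

    Assumption: category = distance to the nearest seeker, clamped to
    [1, num_categories]; unreachable locations get the last category.
    """
    dist = distances_to_nearest_seeker(graph, seekers)
    return {
        loc: min(max(dist.get(loc, num_categories), 1), num_categories)
        for loc in possible
    }


possible = update_possible_locations({1}, GRAPH, "taxi", seekers=[3])
categories = categorize(possible, GRAPH, seekers=[3], num_categories=3)
```

In this sketch a seeker's own location is simply excluded from the candidate set. A fuller version would also prune candidates after each seeker move and reset the set to a single location whenever the hider's position is announced.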
In the general table, we store for each category the number of times one or more possible locations belonged to the category, n, and the number of times the actual location of the hider belonged to the category, a. This way of storing and using information is similar to the transition probabilities used in Realization Probability Search [7]. In the detailed table, for each possible combination of categories, we store how many times the actual location of the hider belonged to each category.

There are two different ways to gather the information for these tables: offline and online. With offline information gathering, a large number of games is played first and the information is stored in a file. This information can later be used by the seekers. With online information gathering, the seekers start without any information. At the end of each game, the seekers update the information with the statistics gathered from that game.

The seekers use a vector of length L to select a location for the hider at the start of each MCTS iteration. The values in this vector represent the weights of the categories. When using the general table, this vector consists of the values $[a_1/n_1, a_2/n_2, \ldots, a_L/n_L]$. When using the detailed table, the vector is taken directly from the table, by extracting the vector corresponding to the combination of categories. To select a possible location, roulette-wheel selection is used. The size of each possible location on the wheel corresponds to the value of its category in the vector.

Scotland Yard is a cooperative multi-player game. Therefore, the seekers can be considered as one player, making the game essentially a 2-player game. If in a playout one seeker captures the hider, the playout is considered a win for all seekers and the result is backpropagated accordingly.
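The general table and the roulette-wheel selection of a hider location might look like the sketch below. It is illustrative only: the function names, the list-based table layout, the zero weight for categories never observed, and the uniform fallback when every weight is zero are assumptions that the abstract does not specify.

```python
import random


def general_table_weights(a, n):
    """Return the weight vector [a_1/n_1, ..., a_L/n_L].

    a[i]: times the hider's actual location was in category i+1.
    n[i]: times at least one possible location was in category i+1.
    Assumption: a category that was never observed gets weight 0.
    """
    return [ai / ni if ni > 0 else 0.0 for ai, ni in zip(a, n)]


def update_general_table(a, n, categories, actual_location):
    """Record one observation once the hider's true location is known.

    categories: dict mapping each possible location to its category (1..L).
    """
    for category in set(categories.values()):
        n[category - 1] += 1
    a[categories[actual_location] - 1] += 1


def roulette_select(categories, weights, rng=random):
    """Pick a possible location with probability proportional to the
    weight of its category (roulette-wheel selection)."""
    locations = sorted(categories)
    slices = [weights[categories[loc] - 1] for loc in locations]
    if sum(slices) <= 0:
        # Assumption: with no usable statistics, fall back to a uniform guess.
        return rng.choice(locations)
    return rng.choices(locations, weights=slices, k=1)[0]


# Usage with made-up statistics for L = 3 categories.
weights = general_table_weights(a=[1, 3, 6], n=[10, 10, 10])
guess = roulette_select({10: 1, 20: 3, 30: 3}, weights, rng=random.Random(0))
```

Each possible location gets its own slice of the wheel, so a category that contains several locations receives proportionally more total probability, which matches the description that the size of each location on the wheel corresponds to its category's value.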
However, when using this backpropagation rule, we observed that seekers sometimes rely too much on the other seekers and do not make any effort to capture the hider. To solve this problem, we propose Coalition Reduction. If the seeker who is the root player captures the hider, a score of 1 is returned. If another seeker captures the hider, a smaller score, 1 − r, is returned, where r ∈ [0, 1].
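The Coalition Reduction rule can be written as a small reward function. This is a sketch from the root seeker's point of view; the function name, the use of `None` for "no capture", and the score of 0 when the hider escapes are assumptions, as the abstract only states the two capture cases.

```python
def coalition_reduction_reward(root_seeker, capturing_seeker, r):
    """Playout score for the root seeker under Coalition Reduction.

    root_seeker:      identifier of the seeker that is the root player.
    capturing_seeker: identifier of the seeker that caught the hider,
                      or None if the hider was not caught (assumption).
    r:                reduction parameter, with r in [0, 1].
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    if capturing_seeker is None:
        return 0.0          # assumption: the hider escaped, seekers lose
    if capturing_seeker == root_seeker:
        return 1.0          # the root seeker made the capture itself
    return 1.0 - r          # another seeker made the capture


# Usage: with r = 0.25, a capture by a teammate is worth 0.75 to the root seeker.
score = coalition_reduction_reward(root_seeker=2, capturing_seeker=4, r=0.25)
```

Setting r = 0 recovers the plain cooperative rule in which any capture counts as a full win, while r = 1 makes each seeker value only its own captures.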

Similar resources

Monte-Carlo Approximation of Temperature

Monte-Carlo tree search is a powerful paradigm for the game of Go. We propose to use Monte-Carlo tree search to approximate the temperature of a game, using the mean result of the playouts. Experimental results on the sum of five 7x7 Go games show that it improves much on a global search algorithm.

Efficient Sampling Method for Monte Carlo Tree Search Problem

We consider the Monte Carlo tree search problem, a variant of the Min-Max tree search problem in which the score of each leaf is the expectation of some Bernoulli variables and is not explicitly given but can be estimated through (random) playouts. The goal of this problem is, given a game tree and an oracle that returns an outcome of a playout, to find a child node of the root which attains an approximate m...

Revisiting Monte-Carlo Tree Search on a Normal Form Game: NoGo

We revisit Monte-Carlo Tree Search on a recent game, termed NoGo. Our goal is to check whether known results in Computer-Go and various other games are general enough to be applied directly to a new game. We also test whether the known limitations of Monte-Carlo Tree Search hold in this case, and which improvements of Monte-Carlo Tree Search are necessary for good performance and which have a min...

Monte-Carlo Tree Search for General Game Playing

We present a game engine for general game playing based on UCT, a combination of Monte-Carlo and tree search. The resulting program is named ARY. Despite the modest number of random games played by ARY before choosing a move, it scored quite well in the qualifying phase of the annual general game playing tournament hosted by AAAI.

A Risky Proposal: Designing a Risk Game Playing Agent

Monte Carlo Tree Search methods provide a general framework for modeling decision problems by randomly sampling the decision space and constructing a search tree according to the sampling results. Artificial Intelligences employing these methods in games with massive decision spaces such as Go and Settlers of Catan have recently demonstrated far superior results compared to the previous classi...

Publication date: 2011